AITopics | mean field

Learning in Stackelberg Mean Field Games: A Non-Asymptotic Analysis

Neural Information Processing Systems, Jun-23-2026, 03:38:05 GMT

We study policy optimization in Stackelberg mean field games (MFGs), a hierarchical framework for modeling the strategic interaction between a single leader and an infinitely large population of homogeneous followers. The objective can be formulated as a structured bi-level optimization problem, in which the leader needs to learn a policy maximizing its reward, anticipating the response of the followers. Existing methods for solving these (and related) problems often rely on restrictive independence assumptions between the leader's and followers' objectives, use samples inefficiently due to nested-loop algorithm structure, and lack finite-time convergence guarantees. To address these limitations, we propose AC-SMFG, a single-loop actor-critic algorithm that operates on continuously generated Markovian samples. The algorithm alternates between (semi-)gradient updates for the leader, a representative follower, and the mean field, and is simple to implement in practice. We establish the finite-time and finite-sample convergence of the algorithm to a stationary point of the Stackelberg objective. To our knowledge, this is the first Stackelberg MFG algorithm with non-asymptotic convergence guarantees. Our key assumption is a "gradient alignment" condition, which requires that the full policy gradient of the leader can be approximated by a partial component of it, relaxing the existing leader-follower independence assumption. Simulation results in a range of well-established economics environments demonstrate that AC-SMFG outperforms existing multi-agent and MFG learning baselines in policy quality and convergence speed.

artificial intelligence, follower, machine learning, (17 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Banking & Finance > Trading (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

0b0d29e5d5c8a7a25dced6405bd022a9-Supplemental.pdf

Neural Information Processing Systems, Apr-24-2026, 15:32:02 GMT

We introduce regularized Frank-Wolfe, a general and effective algorithm for inference and learning of dense conditional random fields (CRFs). The algorithm optimizes a nonconvex continuous relaxation of the CRF inference problem using vanilla Frank-Wolfe with approximate updates, which are equivalent to minimizing a regularized energy function. Our proposed method is a generalization of existing algorithms such as mean field or concave-convex procedure. This perspective not only offers a unified analysis of these algorithms, but also allows an easy way of exploring different variants that potentially yield better performance. We illustrate this in our empirical results on standard semantic segmentation datasets, where several instantiations of our regularized Frank-Wolfe outperform mean field inference, both as a standalone component and as an end-to-end trainable layer in a neural network. We also show that dense CRFs, coupled with our new algorithms, produce significant improvements over strong CNN baselines.
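As a rough illustration of the idea in this abstract, the sketch below runs Frank-Wolfe on a toy relaxed labeling problem. It is not the authors' implementation. The energy form, the function names `energy` and `frank_wolfe`, and the parameters `lam` and `step` are all assumptions made for this example. With `lam = 0` each step moves toward a one-hot vertex (vanilla Frank-Wolfe). With `lam > 0` the linear step is replaced by an entropy-regularized one, which is a row-wise softmax.

```python
import numpy as np

def energy(x, unary, pairwise):
    # Toy relaxed energy: <unary, x> + 0.5 * sum_ij pairwise[i, j] * <x_i, x_j>
    return float(np.sum(unary * x) + 0.5 * np.sum(x * (pairwise @ x)))

def frank_wolfe(unary, pairwise, steps=50, lam=0.0, step=None):
    """Frank-Wolfe over a product of simplices (one simplex per row of x).

    Assumes `pairwise` is symmetric with a zero diagonal.
    """
    n, k = unary.shape
    x = np.full((n, k), 1.0 / k)           # start at the centre of each simplex
    for t in range(steps):
        grad = unary + pairwise @ x         # gradient of the toy energy
        if lam > 0:
            # Regularized step: argmin_s <grad, s> + lam * sum(s * log s),
            # whose closed form is a row-wise softmax of -grad / lam.
            z = -grad / lam
            z -= z.max(axis=1, keepdims=True)
            s = np.exp(z)
            s /= s.sum(axis=1, keepdims=True)
        else:
            # Vanilla step: the simplex vertex minimizing <grad, s> in each row.
            s = np.zeros_like(x)
            s[np.arange(n), grad.argmin(axis=1)] = 1.0
        gamma = 2.0 / (t + 2.0) if step is None else step
        x = (1.0 - gamma) * x + gamma * s   # convex combination stays feasible
    return x
```

In this toy setting, calling `frank_wolfe(unary, pairwise, lam=1.0, step=1.0)` makes every iterate equal to the softmax of the negative gradient, which is the usual parallel mean field update for this energy.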

artificial intelligence, frank-wolfe, machine learning, (20 more...)

Neural Information Processing Systems

Country: Europe (0.67)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

0b0d29e5d5c8a7a25dced6405bd022a9-Paper.pdf

Neural Information Processing Systems, Apr-24-2026, 15:31:59 GMT

We introduce regularized Frank-Wolfe, a general and effective algorithm for inference and learning of dense conditional random fields (CRFs). The algorithm optimizes a nonconvex continuous relaxation of the CRF inference problem using vanilla Frank-Wolfe with approximate updates, which are equivalent to minimizing a regularized energy function. Our proposed method is a generalization of existing algorithms such as mean field or concave-convex procedure. This perspective not only offers a unified analysis of these algorithms, but also allows an easy way of exploring different variants that potentially yield better performance. We illustrate this in our empirical results on standard semantic segmentation datasets, where several instantiations of our regularized Frank-Wolfe outperform mean field inference, both as a standalone component and as an end-to-end trainable layer in a neural network. We also show that dense CRFs, coupled with our new algorithms, produce significant improvements over strong CNN baselines.

algorithm, artificial intelligence, machine learning, (19 more...)

Neural Information Processing Systems

Country: Europe (0.68)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

0b0d29e5d5c8a7a25dced6405bd022a9-Paper.pdf

Neural Information Processing Systems, Feb-7-2026, 10:53:46 GMT

algorithm, frank-wolfe, proceedings, (16 more...)

Neural Information Processing Systems

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Learning in Stackelberg Mean Field Games: A Non-Asymptotic Analysis

Zeng, Sihan; Evans, Benjamin Patrick; Bhatt, Sujay; Ardon, Leo; Ganesh, Sumitra; Koppel, Alec

arXiv.org Artificial Intelligence, Nov-27-2025

We study policy optimization in Stackelberg mean field games (MFGs), a hierarchical framework for modeling the strategic interaction between a single leader and an infinitely large population of homogeneous followers. The objective can be formulated as a structured bi-level optimization problem, in which the leader needs to learn a policy maximizing its reward, anticipating the response of the followers. Existing methods for solving these (and related) problems often rely on restrictive independence assumptions between the leader's and followers' objectives, use samples inefficiently due to nested-loop algorithm structure, and lack finite-time convergence guarantees. To address these limitations, we propose AC-SMFG, a single-loop actor-critic algorithm that operates on continuously generated Markovian samples. The algorithm alternates between (semi-)gradient updates for the leader, a representative follower, and the mean field, and is simple to implement in practice. We establish the finite-time and finite-sample convergence of the algorithm to a stationary point of the Stackelberg objective. To our knowledge, this is the first Stackelberg MFG algorithm with non-asymptotic convergence guarantees. Our key assumption is a "gradient alignment" condition, which requires that the full policy gradient of the leader can be approximated by a partial component of it, relaxing the existing leader-follower independence assumption. Simulation results in a range of well-established economics environments demonstrate that AC-SMFG outperforms existing multi-agent and MFG learning baselines in policy quality and convergence speed.

artificial intelligence, follower, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2509.15392

Genre: Research Report > New Finding (0.45)

Industry: Banking & Finance > Trading (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Mean Field for the Stochastic Blockmodel: Optimization Landscape and Convergence Issues

Neural Information Processing Systems, Nov-20-2025, 23:06:27 GMT

Variational approximation has recently been widely used in large-scale Bayesian inference; its simplest form imposes a mean field assumption to approximate complicated latent structures. Despite the computational scalability of mean field, theoretical studies of its loss function surface and the convergence behavior of iterative updates for optimizing the loss are far from complete. In this paper, we focus on the problem of community detection for a simple two-class Stochastic Blockmodel (SBM). Using batch co-ordinate ascent (BCAVI) for updates, we give a complete characterization of all the critical points and show different convergence behaviors with respect to initializations. When the parameters are known, we show that a significant proportion of random initializations will converge to the ground truth. On the other hand, when the parameters themselves need to be estimated, a random initialization will converge to an uninformative local optimum.
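For readers unfamiliar with batch mean field updates, the following is a minimal sketch for a two-class SBM with known within-class and between-class edge probabilities `p > q`. It is not taken from the paper. The update is derived here from the standard mean field objective under a uniform class prior. The function name `bcavi` and its arguments are assumptions made for this example.

```python
import numpy as np

def bcavi(adj, p, q, psi0, iters=20):
    """Batch mean field updates for a two-class SBM with known p > q.

    adj  : symmetric 0/1 adjacency matrix with a zero diagonal
    psi0 : initial probabilities that each node belongs to class 1
    """
    n = adj.shape[0]
    t = 0.5 * np.log(p * (1 - q) / (q * (1 - p)))
    lam = np.log((1 - q) / (1 - p)) / (2 * t)
    # Observed edges minus a constant penalty on every off-diagonal pair.
    m = adj - lam * (np.ones((n, n)) - np.eye(n))
    psi = np.asarray(psi0, dtype=float)
    for _ in range(iters):
        xi = 4 * t * m @ (psi - 0.5)       # log-odds of class 1 for every node
        psi = 1.0 / (1.0 + np.exp(-xi))    # all nodes are updated at once
    return psi
```

On two disjoint 4-cliques with `p = 0.9`, `q = 0.1` and a mildly informative start such as `[0.6] * 4 + [0.4] * 4`, the iterates separate the two groups. A start at exactly `0.5` for every node never moves, which is one simple example of an uninformative critical point of this update.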

name change, optimization landscape and convergence issue, stochastic blockmodel, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.41)

From Individual Learning to Market Equilibrium: Correcting Structural and Parametric Biases in RL Simulations of Economic Models

Chen, Ruxin; Zhang, Zeqiang

arXiv.org Artificial Intelligence, Oct-21-2025

The application of Reinforcement Learning (RL) to economic modeling reveals a fundamental conflict between the assumptions of equilibrium theory and the emergent behavior of learning agents. While canonical economic models assume atomistic agents act as 'takers' of aggregate market conditions, a naive single-agent RL simulation incentivizes the agent to become a 'manipulator' of its environment. This paper first demonstrates this discrepancy within a search-and-matching model with concave production, showing that a standard RL agent learns a non-equilibrium, monopsonistic policy. Additionally, we identify a parametric bias arising from the mismatch between economic discounting and RL's treatment of intertemporal costs. To address both issues, we propose a calibrated Mean-Field Reinforcement Learning framework that embeds a representative agent in a fixed macroeconomic field and adjusts the cost function to reflect economic opportunity costs. Our iterative algorithm converges to a self-consistent fixed point where the agent's policy aligns with the competitive equilibrium. This approach provides a tractable and theoretically sound methodology for modeling learning agents in economic systems within the broader domain of computational social science.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2507.18229

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Banking & Finance > Economy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems, Oct-9-2025, 14:07:47 GMT

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The paper introduces a full model for tracking while allowing for a multiple and varying number of hypotheses and clutter. It promises a clear notation and fast algorithms through the use of variational/Baum-Welch type inference. Experiments appear extensive and are performed on real-world data. The key novelty of this paper is the assignment problem (aka data association). Tracking itself, as the authors acknowledge, is a well-trodden field.

algorithm, approximation, originality and significance, (11 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.05)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.71)

Networked Communication for Decentralised Cooperative Agents in Mean-Field Control

Benjamin, Patrick; Abate, Alessandro

arXiv.org Artificial Intelligence, Mar-12-2025

We introduce networked communication to mean-field control (MFC) - the cooperative counterpart to mean-field games (MFGs) - and in particular to the setting where decentralised agents learn online from a single, non-episodic run of the empirical system. We adapt recent algorithms for MFGs to this new setting, and contribute a novel sub-routine allowing networked agents to estimate the global average reward from their local neighbourhood. We show that the networked communication scheme allows agents to increase social welfare faster than under both the centralised and independent architectures, by computing a population of potential updates in parallel and then propagating the highest-performing ones through the population, via a method that can also be seen as tackling the credit-assignment problem. We prove this new result theoretically and provide experiments that support it across numerous games, as well as exploring the empirical finding that smaller communication radii can benefit convergence in a specific class of game while still outperforming agents learning entirely independently. We provide numerous ablation studies and additional experiments on the number of communication rounds and robustness to communication failures.

agent, algorithm, benjamin & abate, (11 more...)

arXiv.org Artificial Intelligence

2503.094

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York > New York County > New York City (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Reviews: CRF-CNN: Modeling Structured Information in Human Pose Estimation

Neural Information Processing Systems, Jan-20-2025, 13:05:52 GMT

The general idea of this work is clearly in a direction of interest to the community and the results look strong. However, there are a few aspects to this work that I find quite unclear. If they were clearer this work would have much more potential for impact. In particular, it is not clear enough if this work uses a truly 'end-to-end' approach (as stated for one of the contributions). For example, on line 46 it is stated that: "We show step by step how approximations are made to use an end-to-end learning CNN for implementing such CRF model."

artificial intelligence, machine learning, modeling structured information, (11 more...)

Neural Information Processing Systems

Industry: Energy > Oil & Gas (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision > Video Understanding (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)
